degenerate direction
Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent Liu Ziyin Massachusetts Institute of Technology, NTT Research
Symmetries are prevalent in deep learning and can significantly influence the learning dynamics of neural networks. In this paper, we examine how exponential symmetries - a broad subclass of continuous symmetries present in the model architecture or loss function - interplay with stochastic gradient descent (SGD). We first prove that gradient noise creates a systematic motion (a "Noether flow") of the parameters θ along the degenerate direction to a unique initialization-independent fixed point θ
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > China (0.04)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.84)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Asia > China (0.04)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
- (6 more...)
Probabilistic Degeneracy Detection for Point-to-Plane Error Minimization
Hatleskog, Johan, Alexis, Kostas
Degeneracies arising from uninformative geometry are known to deteriorate LiDAR-based localization and mapping. This work introduces a new probabilistic method to detect and mitigate the effect of degeneracies in point-to-plane error minimization. The noise on the Hessian of the point-to-plane optimization problem is characterized by the noise on points and surface normals used in its construction. We exploit this characterization to quantify the probability of a direction being degenerate. The degeneracy-detection procedure is used in a new real-time degeneracy-aware iterative closest point algorithm for LiDAR registration, in which we smoothly attenuate updates in degenerate directions. The method's parameters are selected based on the noise characteristics provided in the LiDAR's datasheet. We validate the approach in four real-world experiments, demonstrating that it outperforms state-of-the-art methods at detecting and mitigating the adverse effects of degeneracies. For the benefit of the community, we release the code for the method at: github.com/ntnu-arl/drpm.
- North America > United States > North Carolina > Orange County > Chapel Hill (0.14)
- Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
- North America > United States > Texas (0.04)
- (5 more...)
Informed, Constrained, Aligned: A Field Analysis on Degeneracy-aware Point Cloud Registration in the Wild
Tuna, Turcan, Nubert, Julian, Pfreundschuh, Patrick, Cadena, Cesar, Khattak, Shehryar, Hutter, Marco
The ICP registration algorithm has been a preferred method for LiDAR-based robot localization for nearly a decade. However, even in modern SLAM solutions, ICP can degrade and become unreliable in geometrically ill-conditioned environments. Current solutions primarily focus on utilizing additional sources of information, such as external odometry, to either replace the degenerate directions of the optimization solution or add additional constraints in a sensor-fusion setup afterward. In response, this work investigates and compares new and existing degeneracy mitigation methods for robust LiDAR-based localization and analyzes the efficacy of these approaches in degenerate environments for the first time in the literature at this scale. Specifically, this work proposes and investigates i) the incorporation of different types of constraints into the ICP algorithm, ii) the effect of using active or passive degeneracy mitigation techniques, and iii) the choice of utilizing global point cloud registration methods on the ill-conditioned ICP problem in LiDAR degenerate environments. The study results are validated through multiple real-world field and simulated experiments. The analysis shows that active optimization degeneracy mitigation is necessary and advantageous in the absence of reliable external estimate assistance for LiDAR-SLAM. Furthermore, introducing degeneracy-aware hard constraints in the optimization before or during the optimization is shown to perform better in the wild than by including the constraints after. Moreover, with heuristic fine-tuned parameters, soft constraints can provide equal or better results in complex ill-conditioned scenarios. The implementations used in the analysis of this work are made publicly available to the community.
- Europe > Switzerland > Zürich > Zürich (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > North Carolina (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.86)
The Implicit Bias of Gradient Noise: A Symmetry Perspective
Ziyin, Liu, Wang, Mingze, Wu, Lei
We characterize the learning dynamics of stochastic gradient descent (SGD) when continuous symmetry exists in the loss function, where the divergence between SGD and gradient descent is dramatic. We show that depending on how the symmetry affects the learning dynamics, we can divide a family of symmetry into two classes. For one class of symmetry, SGD naturally converges to solutions that have a balanced and aligned gradient noise. For the other class of symmetry, SGD will almost always diverge. Then, we show that our result remains applicable and can help us understand the training dynamics even when the symmetry is not present in the loss function. Our main result is universal in the sense that it only depends on the existence of the symmetry and is independent of the details of the loss function. We demonstrate that the proposed theory offers an explanation of progressive sharpening and flattening and can be applied to common practical problems such as representation normalization, matrix factorization, and the use of warmup.
- Asia > Middle East > Jordan (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
- (2 more...)